An Alignment-Free Distance Measure for Closely Related Genomes

Authors

  • Bernhard Haubold
  • Mirjana Domazet-Loso
  • Thomas Wiehe

Abstract

Phylogeny reconstruction on a genome scale remains computationally challenging even for closely related organisms. Here we propose an alignment-free pairwise distance measure, Kr, for genomes separated by less than approximately 0.5 mismatches/nucleotide. We have implemented the computation of Kr based on enhanced suffix arrays in the program kr, which is freely available from guanine.evolbio.mpg.de/kr/. The software is applied to genomes obtained from three sets of taxa: 27 primate mitochondria, eight Staphylococcus agalactiae strains, and 12 Drosophila species. Subsequent clustering of the Kr values always recovers phylogenies that are similar or identical to the accepted branching order.
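
The abstract says that the pairwise Kr values are clustered into a phylogeny but does not name the clustering method. As a rough illustration of that last step only, the sketch below applies average-linkage clustering (UPGMA) to a small distance matrix. UPGMA is a stand-in chosen here for brevity, not necessarily the method used with kr; the function name `upgma`, the taxon labels, and all distance values are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def upgma(names, dist):
    """Average-linkage (UPGMA) clustering; returns the tree as nested tuples.

    names: list of taxon labels
    dist:  symmetric matrix (list of lists) of pairwise distances
    """
    # Active clusters: id -> (subtree, number of leaves in it)
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Distances between active clusters, keyed by the pair of cluster ids
    d = {frozenset((i, j)): dist[i][j]
         for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)

    while len(clusters) > 1:
        # Pick the closest pair of active clusters
        pair = min(d, key=d.get)
        a, b = sorted(pair)
        (tree_a, size_a), (tree_b, size_b) = clusters.pop(a), clusters.pop(b)
        del d[pair]
        # Distance from the merged cluster to every other cluster is the
        # size-weighted mean of the two old distances
        for c in clusters:
            d_ac = d.pop(frozenset((a, c)))
            d_bc = d.pop(frozenset((b, c)))
            d[frozenset((next_id, c))] = (
                (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
            )
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        next_id += 1

    (tree, _), = clusters.values()
    return tree


# Hypothetical pairwise distances, invented purely for illustration
names = ["human", "chimp", "gorilla", "orangutan"]
dist = [
    [0.00, 0.10, 0.16, 0.30],
    [0.10, 0.00, 0.17, 0.31],
    [0.16, 0.17, 0.00, 0.32],
    [0.30, 0.31, 0.32, 0.00],
]
print(upgma(names, dist))
```

In practice the matrix would be filled with the distances reported by the kr program, and a neighbor-joining implementation could be substituted for `upgma` without changing the input format.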

Similar resources

gmos: Rapid Detection of Genome Mosaicism over Short Evolutionary Distances

Prokaryotic and viral genomes are often altered by recombination and horizontal gene transfer. The existing methods for detecting recombination are primarily aimed at viral genomes or sets of loci, since the expensive computation of underlying statistical models often hinders the comparison of complete prokaryotic genomes. As an alternative, alignment-free solutions are more efficient, but cann...

Alignment-Free Genome Tree Inference by Learning Group-Specific Distance Metrics

Understanding the evolutionary relationships between organisms is vital for their in-depth study. Gene-based methods are often used to infer such relationships, which are not without drawbacks. One can now attempt to use genome-scale information, because of the ever increasing number of genomes available. This opportunity also presents a challenge in terms of computational efficiency. Two funda...

Genomic Classification Using an Information-Based Similarity Index: Application to the SARS Coronavirus

Measures of genetic distance based on alignment methods are confined to studying sequences that are conserved and identifiable in all organisms under study. A number of alignment-free techniques based on either statistical linguistics or information theory have been developed to overcome the limitations of alignment methods. We present a novel alignment-free approach to measuring the similarity...

Accurately Measuring Recombination between Closely Related HIV-1 Genomes

Retroviral recombination is thought to play an important role in the generation of immune escape and multiple drug resistance by shuffling pre-existing mutations in the viral population. Current estimates of HIV-1 recombination rates are derived from measurements within reporter gene sequences or genetically divergent HIV sequences. These measurements do not mimic the recombination occurring in...

Estimating Mutation Distances from Unaligned Genomes

Alignment-free distance measures are generally less accurate but more efficient than traditional alignment-based metrics. In the context of genome sequence analysis, the efficiency gain is often so substantial that it outweighs the loss in accuracy. However, a further disadvantage of alignment-free distances is that their relationship to evolutionary events such as substitutions is ge...

Publication date: 2008